Diverging co-translational protein complex assembly pathways are governed by interface energy distribution

Protein-protein interactions are at the heart of all cellular processes, with the ribosome emerging as a platform that orchestrates nascent-chain interplay dynamics. Here, to study the characteristics governing co-translational protein folding and complex assembly, we combine selective ribosome profiling, imaging, and N-terminomics with all-atom molecular dynamics (MD). Focusing on conserved N-terminal acetyltransferases (NATs), we uncover diverging co-translational assembly pathways, where highly homologous subunits serve opposite functions. We find that only a few residues serve as “hotspots,” initiating co-translational assembly interactions upon exposure at the ribosome exit tunnel. These hotspots are characterized by high binding energy, anchoring the entire interface assembly. Alpha-helices harboring hotspots are highly thermolabile, folding and unfolding during simulations, and depend on their partner subunit to avoid misfolding. In vivo hotspot mutations disrupted co-translational complexation, leading to aggregation. Accordingly, conservation analysis reveals that missense NAT variants, causing neurodevelopmental and neurodegenerative diseases, disrupt putative hotspot clusters. Expanding our study to include phosphofructokinase, anthranilate synthase, and a nucleoporin subcomplex, we employ AlphaFold-Multimer to model the complexes’ complete structures. Computing MD-derived interface energy profiles, we find similar trends. We propose a model based on the distribution of interface energy as a strong predictor of co-translational assembly.

Supplementary Figure 1: NatA and NatB subunits' functionality and structural analysis
(a) GFP tagging does not affect the growth of the NatA and NatB subunits in Saccharomyces cerevisiae. Growth assay on YPD at 30 °C and at 37 °C of the indicated strains vs. wildtype (WT). All the tagged strains show no impact on growth. n = 3; a representative image is shown.
(b-d) RMSF of the free catalytic subunits of Saccharomyces cerevisiae NatA and NatB and Candida albicans NatB. All subunits were simulated at 30 °C, 50 °C, and 100 °C for 300 ns (x3 repeats). At all temperatures, the higher mobility of co-translationally dependent subunits (Naa20) remained, excluding Naa10's long, polar loops.
(e-g) RMSF of the free auxiliary subunits of Saccharomyces cerevisiae NatA and NatB and Candida albicans NatB. All subunits were simulated at 30 °C, 50 °C, and 100 °C for 300 ns (x3 repeats). At all temperatures, the higher mobility of co-translationally dependent subunits (Naa15) remained.
(h-i) RMSF boxplots of the different regions of the catalytic and auxiliary subunits at 50 °C for 300 ns (x3 repeats). The regions in the catalytic subunits boxplot (h) include amino acids 1-50, the minimal region (before the onset of co-translational interactions), the post-minimal region, and the complete protein. For the auxiliary subunits boxplot (i), the included regions are the minimal region (before the onset of co-translational interactions), the post-minimal region, and the complete protein. Minimal regions are determined by the onset of co-translational interactions in S. cerevisiae and its equivalent in C. albicans, as calculated by TM-Align.
(j-m) RMSD of the full-length catalytic subunits (j) and auxiliary subunits (k) at 30 °C, 50 °C, and 100 °C for 300 ns (x3 repeats), as well as RMSD of the TM-Aligned cores of the catalytic (l) and auxiliary subunits (m) at 30 °C, 50 °C, and 100 °C for 300 ns (x3 repeats).
(c) The predicted aligned error of the entire complex shows low error for inter-residue distances, including those between the two subunits.
(d) pLDDT score per residue along the protein.
(e) Predicted aligned error (PAE) of all AlphaFold-Multimer models, ordered by rank for each complex. The color at (x, y) indicates AlphaFold's expected position error at residue x if the predicted and true structures were aligned on residue y, from red (30, high error) to blue (0, low error). For all models, the error is overall low for each protein's relative alignment with itself. The error is also low at the interface between the proteins. Black lines separate the subunits of the same complex in the same order as indicated on the left. If the PAE is generally low for residue pairs (x, y) from two different domains, it indicates that AlphaFold predicts well-defined relative positions and orientations for them.
(f) Comparison of the energy profiles of NatB's subunits: AlphaFold-Mm-generated versus a solved structure (PDB: 8BIP). Left: a comparison of the catalytic subunit; Right: a comparison of the auxiliary subunit.
(a) A model of the human NatA heterodimer generated by AlphaFold. The catalytic subunit Naa10 is in green, and the auxiliary subunit is in tan.
(b) Human Naa10 from the solved structure PDB: 6C9M², highlighting complex interface hotspots in red. NatA complex interface energy contribution per residue in a range from -2 (red) to 0 (grey) ΔΔG [kcal/mol], computed from 300 ns MD simulations, using 1000 frames from the last 20 ns, with pyDock bindEy. The α-helices at positions aa 8-20 and aa 28-37 (centered) are identified as clusters of hotspot residues contributing the most energy to the interface.
(c) Wildtype human Naa10 highlighting disease mutants D10G, L11R, and S37P (stick representation). All residues are located in the two highly energetic helices.
(d-e) Free human Naa10 subunit thermostability in the wildtype (d) compared to the disease mutants (e) D10G, L11R, and S37P. Conformational changes predicted by MD simulations at 30 °C over a 300 ns timeframe. Timepoint 0 ns of the simulations is as displayed in (c). Only the frame at 300 ns is shown for all. The mutated proteins were obtained by replacing the wildtype residues, followed by equilibration before the production run. The MD simulations show that the mutations affect the conformation of the two α-helices harboring many hotspots.
(f-g) RMSD for wildtype, D10G, L11R, and S37P computed for the two α-helices. The RMSD boxplots (g) indicate that the conformation of the wildtype is maintained during the simulation, while the mutants change conformation relative to the starting point (unpaired two-sample t-test).
(h) RMSF of the wildtype and the mutants, per residue, along the ORF.
(i-l) RMSF boxplots of the entire protein or the indicated segments, demonstrating the higher fluctuations of the mutants at the two α-helices (unpaired two-sample t-test).
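
The RMSF values reported in these panels follow the standard definition: for each atom, the root of the mean squared deviation from its time-averaged position. A minimal sketch of that calculation is given below, assuming the frames have already been superimposed on a common reference. The function name `rmsf`, the nested-list trajectory layout, and the toy coordinates are hypothetical and are not taken from the authors' analysis pipeline.

```python
import math

def rmsf(trajectory):
    """Per-atom RMSF for a pre-aligned trajectory.

    `trajectory` is a list of frames; each frame is a list of (x, y, z)
    tuples, one per atom (hypothetical layout chosen for illustration).
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared deviation of atom i from its average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result

# Toy two-frame, two-atom trajectory: atom 0 is static, atom 1 moves along x.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
values = rmsf(toy)
print(values)
```

In practice the per-residue profile would typically be computed on Cα atoms only, and region-level boxplots such as those in (i-l) would group these per-residue values by segment.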

(n) NatA and NatB subunit solubility analysis, determined by changes in localization patterns. Endogenously tagged Naa20-GFP Saccharomyces cerevisiae cells were grown to log phase (30 °C). Cells were then fixed and subjected to confocal microscopy. Representative images are shown. Scale bar, 4 μm. No significant aggregation of GFP-tagged Naa20 (catalytic subunit, NatB) was observed in either the wildtype or the naa10∆ (deletion of the NatA catalytic subunit) strain. n > 150 cells.

Supplementary Figure 2: TRiC/CCT chaperone binding sites. TRiC/CCT chaperone binding sites (pink spheres) of Saccharomyces cerevisiae Naa10 (green ribbon), as reported in Stein et al. (2019)¹. Naa15 is represented as a tan ribbon (minimal region) and a grey ribbon (post-minimal region).

Western blot analysis of immunoprecipitated HA-tagged Naa10. Western blot against HA (rabbit) after IP against HA (mouse, IgG2a). 10% of the IP product was used for the Western blot, while the rest was used to continue the RIP-qPCR. The ladder used is the Thermo Scientific PageRuler™ Prestained Protein Ladder.
(b) Growth curve analysis of the indicated mutated strains. Averaged OD595 of three biological replicates, each representing a growth curve of one strain. The mean is shown as a dashed line, and the experimental variation (standard deviation) is shaded in the corresponding color.
Comparison of a model generated with AlphaFold-Multimer (left) and a solved structure of Saccharomyces cerevisiae NatB (PDB: 8BIP; right). The catalytic subunits are colored green, and the auxiliary subunits are tan (8BIP) or yellow (AF-Mm model). RMSD between the complexes is less than 1 Å (0.727 Å between 776 pruned atom pairs, or 0.953 Å across all 796 pairs).
(b) pLDDT score per residue overlaid on the AlphaFold-Multimer-generated model of S. cerevisiae NatB. This model has a very high ipTM (interface predicted TM-score) of 0.93 with pTM = 0.933, and the pLDDT (per-residue confidence score) of each subunit is higher than 92 (model confidence = 0.8 · ipTM + 0.2 · pTM = 0.931).
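
The model confidence quoted for the NatB model is a weighted sum of ipTM and pTM. The short sketch below simply reproduces that arithmetic from the two scores given in the caption; the function name `model_confidence` is an illustrative choice, not part of any AlphaFold API.

```python
def model_confidence(iptm: float, ptm: float) -> float:
    """Weighted confidence as stated in the caption: 0.8 * ipTM + 0.2 * pTM."""
    return 0.8 * iptm + 0.2 * ptm

# Scores quoted in the caption for the S. cerevisiae NatB model.
score = model_confidence(iptm=0.93, ptm=0.933)
print(f"{score:.3f}")  # 0.931
```

Because ipTM carries four times the weight of pTM, the ranking is dominated by how well the interface between the subunits is predicted rather than by the fold of each chain.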
Naa10 disease mutants' MD structural analysis reveals a significant impact on alpha helices harboring predicted interface hotspots.

Table 3.
Ribosome binding sites: experimental and predicted binding sites in the auxiliary subunit. The binding residues of 6HD7 are those whose Cα atoms lie within 4 Å of any heavy atom of the ribosome.
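
The distance criterion used to define binding residues can be sketched as a simple cutoff search. The example below is a minimal illustration under stated assumptions: coordinates are in Å, the ribosome atom list has already been filtered to heavy (non-hydrogen) atoms, and the function name `contact_residues`, the input layout, and the toy coordinates are all hypothetical rather than taken from the authors' workflow.

```python
import math

def contact_residues(ca_atoms, ribosome_heavy_atoms, cutoff=4.0):
    """Return ids of residues whose C-alpha lies within `cutoff` Å
    of at least one ribosome heavy atom.

    `ca_atoms` is a list of (residue_id, (x, y, z)) pairs;
    `ribosome_heavy_atoms` is a list of (x, y, z) tuples.
    """
    hits = []
    for res_id, ca in ca_atoms:
        if any(math.dist(ca, atom) <= cutoff for atom in ribosome_heavy_atoms):
            hits.append(res_id)
    return hits

# Toy coordinates (not from 6HD7): residues 12 and 14 are close, 13 is far.
ca_atoms = [
    (12, (0.0, 0.0, 0.0)),
    (13, (10.0, 0.0, 0.0)),
    (14, (0.0, 2.0, 3.0)),
]
ribosome = [(0.0, 0.0, 3.0), (20.0, 0.0, 0.0)]
binding = contact_residues(ca_atoms, ribosome)
print(binding)  # [12, 14]
```

For a full ribosome structure, a brute-force loop like this is slow; a spatial index such as a k-d tree or a cell list would be the usual replacement, with the same cutoff logic.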

Table 4.
pyDock-calculated binding free energies of all complexes, parts, and mutants. pyDock energy in [kcal/mol], broken down into contributions from electrostatic, desolvation, and van der Waals energy, for S. cerevisiae NatA and NatB and C. albicans NatB, as generated by AlphaFold-Multimer and by pyDock followed by long molecular dynamics (1.7 μs). The loops in ScNatA provide more than half of the binding free energy. Both construction methods of the ScNatB model, AlphaFold-Multimer and pyDock followed by MD, gave similar binding free energies.

Table 5.
Scores of the highest-ranking model for each complex.

Table 10.
pyDock bindEy energy contributions [kcal/mol] from each component for 6HD5 at four time points at 25 °C. Simulations were 300 ns long.

Table 12.
List of strains used.

Table 13.
List of primers used for qPCR.

Table 16.
Molecular dynamics checklist. Then, is it described in the text how simulations are split into equilibration and production runs and how much data were analyzed from production runs? Are calculations provided that can connect to experiments (e.g., loss or gain in function from mutagenesis, binding assays, NMR chemical shifts, J-couplings, SAXS curves, interaction distances or FRET distances, structure factors, diffusion coefficients, bulk modulus and other mechanical properties, etc.)?